PHANTOM DUPLEX COPY GROUP APPARATUS FOR A
DISK DRIVE ARRAY DATA STORAGE SUBSYSTEM

FIELD OF THE INVENTION 

This invention relates to data storage subsystems 
and, in particular, to an improved facility for
providing redundant copies of data records for an 
associated host processor. 

PROBLEM 

It is a problem in the field of data storage
subsystems to reliably store data on the data storage
media in a fault tolerant manner. Peripheral data
storage subsystems typically use magnetic disk drives
to store data records thereon for an associated host
processor. A control unit is used to interconnect the
host processor to a plurality of disk drives. In
these data storage subsystems, improved data storage
reliability can be obtained by the use of dual copies,
wherein duplicate copies of a data record are stored
on different disk drives within the data storage
subsystem. One example of dual copy capability is
disclosed in U.S. Patent No. 4,837,680, issued June 6,
1989 to N. Crockett et al. The dual copy feature is
typically provided in response to the host processor
transmitting a "define duplex copy group" system
command which designates one of the disk drives as the
primary data storage device. The host processor also
selects a secondary data storage device to maintain a
duplicate copy of each data record written by the host
processor to the primary data storage device.
Therefore, each data record transmitted by the host
processor to the control unit for storage on the
primary data storage device is also written by the
control unit to the secondary data storage device.
This configuration maintains two copies of each data
record, with the copies being stored on physically
different disk drives behind a single control unit.
In the event that one of the disk drives fails, the
data record is still available to the host processor
on the other disk drive in this duplex copy group.
This arrangement significantly improves the
reliability of the data storage subsystem, but doubles
the cost of storing data because of the need for two
copies of the data record to be maintained on two
separate disk drives.

SOLUTION

The above described problems are solved and a 
technical advance achieved by the phantom duplex copy 
group apparatus in a disk drive array data storage 
subsystem. This apparatus makes use of a disk drive 
array to store the data records for the associated 
host processor. This disk drive array emulates the 
operation of a large form factor disk drive by using 
a plurality of interconnected small form factor disk 
drives. These small form factor disk drives are 
configured into redundancy groups, each of which 
contains n+m disk drives for storing data records and 
redundancy information thereon. Each redundancy 
group, also called a logical disk drive, is divided 
into a number of logical cylinders, each containing i 
logical tracks, one logical track for each of the i 
physical tracks contained in a cylinder of one 
physical disk drive. Each logical track is comprised 
of n+m physical tracks, one physical track from each 
disk drive in the redundancy group. The n+m disk 
drives are used to store n data segments, one on each 
of n physical tracks per logical track, and to store
m redundancy segments, one on each of m physical 
tracks per logical track in the redundancy group. The 
n+m disk drives in a redundancy group have
unsynchronized spindles and loosely coupled actuators. 
The data is transferred to the disk drives via 
independent reads and writes since all disk drives 
operate independently. 

The disk drive array data storage subsystem is a 
dynamically mapped system, and virtual devices are 
defined in the storage control unit contained therein. 
Each virtual device is the image of a disk drive 
presented to the host processor over the channel 
interface. A virtual device is a host-addressable
entity with host-controlled content and host-managed
space allocation. In this system, the virtual device
consists of a mapping of a large form factor disk
drive image onto a plurality of small form factor disk
drives which constitute at least one redundancy group
within the disk drive array. The virtual to physical
mapping is accomplished by the use of a Virtual Device
Table (VDT) entry which represents the virtual device.
The "realization" of the virtual device is the set of
Virtual Track Directory (VTD) entries associated with
the VDT entry, each of which VTD entries contains data
indicative of the Virtual Track Instances, which are
the physical storage locations in the disk drive array
redundancy group that contain the data records.

The use of this configuration is significantly
more reliable than a large form factor disk drive.
However, in order to maintain compatibility with host
processors that request the duplex copy group feature,
the phantom duplex copy group apparatus of the present
invention mimics the creation of a duplex copy group
in this dynamically mapped data storage subsystem
using a disk array and a phantom set of pointers that
mimic the data storage devices on which the data
records are stored. In response to the host processor
requesting the activation of the duplex copy group
capability and the associated designation of primary
and secondary disk drives to store the data thereon,
the apparatus of the present invention implements the
host processor request by configuring a pair of
virtual devices to perform as if they were primary and
secondary large form factor disk drives.

The use of redundancy groups with their 
associated redundancy data obviates the need for a 
secondary disk drive to provide data backup as
requested by the host processor. Therefore, in order 
to maximize the data storage capability of the data 
storage subsystem, a second physical copy of the data 
record is not created within the data storage 
subsystem. Instead, in order to emulate the duplex 
copy group capability of a standard data storage 
subsystem, the present apparatus links together a
primary and a secondary Virtual Device Table entry in
response to the host processor requesting activation 
of the duplex copy group capability. The 
implementation of the primary device consists of a 
Virtual Device Table entry in the storage control unit 
which points to a set of Virtual Track Directory 
entries. These entries in the virtual track directory 
map the track image of the virtual device to physical 
storage locations in at least one selected redundancy 
group in the disk drive array. The secondary data 
storage device designated by the host processor is 
implemented by a Virtual Device Table entry which does 
not contain any associated physical data storage 
capability. Instead, the secondary virtual device 
definition in the storage control unit simply points 
to the primary virtual device definition in the 
storage control unit and contains no virtual track 
directory entries associated therewith independent of 
those assigned to the primary virtual device. In this 
manner, the disk drive array data storage subsystem 
emulates the operation of the duplex copy group 
feature as requested by the host processor yet does 
not require the physical replication of the data 
records in order to provide the reliability and 
availability of the data heretofore provided by the 
two physical copies of the duplex copy group feature 
in the large form factor disk drive data storage
subsystems.

BRIEF DESCRIPTION OF THE DRAWING

Figure 1 illustrates in block diagram form the
architecture of the disk drive array data storage
subsystem;

Figure 2 illustrates the cluster control of the
data storage subsystem;

Figure 3 illustrates the disk drive manager of
the data storage subsystem;

Figure 4 illustrates the data record mapping for
the phantom duplex copy group operation;

Figure 5 illustrates the data record mapping for
the suspended phantom duplex copy group operation;

Figures 6 and 7 illustrate, in flow diagram form,
the operational steps taken to perform a data read and
write operation, respectively;

Figure 8 illustrates a typical free space
directory used in the data storage subsystem;

Figure 9 illustrates, in flow diagram form, the
free space collection process;

Figure 10 illustrates, in flow diagram form, the
operation of the phantom duplex copy group operation.

DETAILED DESCRIPTION OF THE DRAWING

The data storage subsystem of the present 
invention uses a plurality of small form factor disk 
drives in place of a single large form factor disk 
drive to implement an inexpensive, high performance,
high reliability disk drive memory that emulates the 
format and capability of large form factor disk 
drives. This system avoids the parity update problem 
of the prior art disk drive array systems by never 
updating the parity. Instead, all new or modified
data is written on empty logical tracks and the old 
data is tagged as obsolete. The resultant "holes" in 
the logical tracks caused by old data are removed by 
a background free-space collection process that 
creates empty logical tracks by collecting valid data 
into previously emptied logical tracks. 
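
The write policy described above can be illustrated with a small sketch. It is only a toy model under stated assumptions, not the subsystem's microcode: the class `LogicalTrackStore`, its methods, and the reduction of a logical track to a short list of record slots are all hypothetical. Modified data is always appended to an empty (open) logical track, the old copy is tagged obsolete, and a collection pass moves the surviving records out of tracks that contain holes.

```python
class LogicalTrackStore:
    """Toy model of never-update-in-place writes (all names hypothetical)."""

    def __init__(self, num_tracks, slots_per_track):
        self.slots_per_track = slots_per_track
        # A logical track is a list of (key, data) slots; None marks a "hole".
        self.tracks = [[] for _ in range(num_tracks)]
        self.location = {}   # key -> (track index, slot index) of the valid copy
        self.open_track = 0  # logical track currently being filled

    def _next_empty_track(self):
        for index, slots in enumerate(self.tracks):
            if not slots:
                return index
        raise RuntimeError("no empty logical track; collect free space first")

    def write(self, key, data):
        if key in self.location:
            track, slot = self.location[key]
            self.tracks[track][slot] = None  # tag the old copy as obsolete
        if len(self.tracks[self.open_track]) >= self.slots_per_track:
            self.open_track = self._next_empty_track()
        slots = self.tracks[self.open_track]
        slots.append((key, data))
        self.location[key] = (self.open_track, len(slots) - 1)

    def read(self, key):
        track, slot = self.location[key]
        return self.tracks[track][slot][1]

    def collect_free_space(self):
        """Rewrite valid records from tracks with holes, emptying those tracks."""
        for index, slots in enumerate(self.tracks):
            if index == self.open_track or None not in slots:
                continue
            survivors = [entry for entry in slots if entry is not None]
            self.tracks[index] = []      # this logical track is empty again
            for key, data in survivors:
                del self.location[key]   # the old copy no longer exists
                self.write(key, data)
```

In this sketch the collection pass runs on demand; the text describes it as a background process, which the model does not attempt to reproduce.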

The plurality of disk drives in the disk drive 
array data storage subsystem are configured into a 
plurality of variable size redundancy groups of n+m 
parallel connected disk drives to store data thereon. 
Each redundancy group, also called a logical disk 
drive, is divided into a number of logical cylinders, 
each containing i logical tracks, one logical track 
for each of the i physical tracks contained in a 
cylinder of one physical disk drive. Each logical
track is comprised of n+m physical tracks, one 
physical track from each disk drive in the redundancy 
group. The n+m disk drives are used to store n data 
segments, one on each of n physical tracks per logical 
track, and to store m redundancy segments, one on each 
of m physical tracks per logical track in the 
redundancy group. The n+m disk drives in a redundancy 
group have unsynchronized spindles and loosely coupled 
actuators. The data is transferred to the disk drives 
via independent reads and writes since all disk drives
operate independently.

In addition, a pool of r globally switchable
backup disk drives is maintained in the data storage
subsystem to automatically substitute a replacement
disk drive for a disk drive in any redundancy group
that fails during operation. The pool of r backup
disk drives provides high reliability at low cost.
Each physical disk drive is designed so that it can
detect a failure in its operation, which allows the m
redundancy segments per logical track to be used for
multi-bit error correction. Identification of the
failed physical disk drive provides information on the
bit position of the errors in the logical track and
the redundancy data provides information to correct
the errors. Once a failed disk drive in a redundancy
group is identified, a backup disk drive from the
shared pool of backup disk drives is automatically
switched in place of the failed disk drive. Control
circuitry reconstructs the data stored on each
physical track of the failed disk drive, using the
remaining n-1 physical tracks of data plus the
associated m physical tracks containing redundancy
segments of each logical track. The reconstructed
data is then written onto the substitute disk drive.
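
The reconstruction step can be sketched for the simplest case. The text does not specify the redundancy code, so the example below assumes m = 1 with a bytewise XOR redundancy segment; the function names are hypothetical. Because the failed drive is identified, its physical track is recovered by combining the surviving n-1 data tracks with the redundancy track.

```python
from functools import reduce


def xor_tracks(tracks):
    """Bytewise XOR of equal-length physical tracks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*tracks))


def make_logical_track(data_tracks):
    """Return the n data tracks followed by one XOR redundancy track (m = 1 assumed)."""
    return list(data_tracks) + [xor_tracks(data_tracks)]


def reconstruct(logical_track, failed_index):
    """Rebuild the physical track of the identified failed drive from the survivors."""
    survivors = [t for i, t in enumerate(logical_track) if i != failed_index]
    return xor_tracks(survivors)
```

A code with m greater than 1 would need a different scheme (for example a Reed-Solomon style code); the single-parity case is used here only because it keeps the sketch short.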

This apparatus makes use of a disk drive array to
store the data records for the associated host
processor. This disk drive array emulates the
operation of a large form factor disk drive by using
a plurality of interconnected small form factor disk
drives. These small form factor disk drives are
configured into redundancy groups, each of which
contains n+m disk drives for storing data records and
redundancy information thereon. Each redundancy
group, also called a logical disk drive, is divided
into a number of logical cylinders, each containing i
logical tracks, one logical track for each of the i
physical tracks contained in a cylinder of one
physical disk drive. Each logical track is comprised
of n+m physical tracks, one physical track from each
disk drive in the redundancy group. The n+m disk
drives are used to store n data segments, one on each
of n physical tracks per logical track, and to store
m redundancy segments, one on each of m physical
tracks per logical track in the redundancy group. The
n+m disk drives in a redundancy group have
unsynchronized spindles and loosely coupled actuators.
The data is transferred to the disk drives via
independent reads and writes since all disk drives
operate independently.

The disk drive array data storage subsystem is a
dynamically mapped system, and virtual devices are
defined in the storage control unit contained therein.
Each virtual device is the image of a disk drive
presented to the host processor over the channel
interface. A virtual device is a host-addressable
entity with host-controlled content and host-managed
space allocation. In this system, the virtual device
consists of a mapping of a large form factor disk
drive image onto a plurality of small form factor disk
drives which constitute at least one redundancy group
within the disk drive array. The virtual to physical
mapping is accomplished by the use of a Virtual Device
Table (VDT) entry which represents the virtual device.
The "realization" of the virtual device is the set of
Virtual Track Directory (VTD) entries associated with
the VDT entry, each of which VTD entries contains data
indicative of the Virtual Track Instances, which are
the physical storage locations in the disk drive array
redundancy group that contain the data records.

The use of this configuration is significantly
more reliable than a large form factor disk drive.
However, in order to maintain compatibility with host
processors that request the duplex copy group feature,
the phantom duplex copy group apparatus of the present
invention mimics the creation of a duplex copy group
in this dynamically mapped data storage subsystem
using a disk array and a phantom set of pointers that
mimic the data storage devices on which the data
records are stored. In response to the host processor
requesting the activation of the duplex copy group
capability and the associated designation of primary
and secondary disk drives to store the data thereon,
the apparatus of the present invention implements the
host processor request by configuring a pair of
virtual devices to perform as if they were primary and
secondary large form factor disk drives.

The use of redundancy groups with their
associated redundancy data obviates the need for a
secondary disk drive to provide data backup as
requested by the host processor. Therefore, in order
to maximize the data storage capability of the data
storage subsystem, a second physical copy of the data
record is not created within the data storage
subsystem. Instead, in order to emulate the duplex
copy group capability of a standard data storage
subsystem, the present apparatus links together a
primary and a secondary Virtual Device Table entry in
response to the host processor requesting activation
of the duplex copy group capability. The
implementation of the primary device consists of a
Virtual Device Table entry in the storage control unit
which points to a set of Virtual Track Directory
entries. These entries in the virtual track directory
map the track image of the virtual device to physical
storage locations in at least one selected redundancy
group in the disk drive array. The secondary data
storage device designated by the host processor is
implemented by a Virtual Device Table entry which does
not contain any associated physical data storage
capability. Instead, the secondary virtual device
definition in the storage control unit simply points
to the primary virtual device definition in the
storage control unit and contains no virtual track
directory entries associated therewith independent of
those assigned to the primary virtual device. In this
manner, the disk drive array data storage subsystem
emulates the operation of the duplex copy group
feature as requested by the host processor yet does
not require the physical replication of the data
records in order to provide the reliability and
availability of the data heretofore provided by the
two physical copies of the duplex copy group feature
in the large form factor disk drive data storage
subsystems.
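
The linking of the two Virtual Device Table entries can be sketched as follows. This is an illustration under assumptions, not the claimed implementation: the class `VirtualDeviceEntry`, the `primary` pointer field, and the functions `define_duplex_copy_group` and `resolve` are hypothetical names. The point shown is that the secondary entry holds no track directory of its own, so a read addressed to either device resolves to the same single physical copy.

```python
class VirtualDeviceEntry:
    """Virtual Device Table entry (hypothetical layout)."""

    def __init__(self, name):
        self.name = name
        self.vtd = {}        # virtual track -> physical storage location
        self.primary = None  # set only on a phantom secondary entry


def define_duplex_copy_group(vdt, primary, secondary):
    """Link the secondary entry to the primary entry; no data record is copied."""
    vdt[secondary].vtd = None  # the secondary owns no track directory entries
    vdt[secondary].primary = vdt[primary]


def resolve(vdt, device, track):
    """Return the physical location for a track on either device of the pair."""
    entry = vdt[device]
    if entry.primary is not None:  # phantom secondary: follow the pointer
        entry = entry.primary
    return entry.vtd[track]
```

Because the secondary only points at the primary definition, a later update of the primary's directory is visible through the secondary without any second write.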

Data Storage Subsystem Architecture 

Figure 1 illustrates in block diagram form the
architecture of the preferred embodiment of the disk
drive array data storage subsystem 100. The disk
drive array data storage subsystem 100 appears to the
associated host processors 11-12 to be a collection of
large form factor disk drives with their associated
storage control, since the architecture of disk drive
array data storage subsystem 100 is transparent to the
associated host processors 11-12. This disk drive
array data storage subsystem 100 includes a plurality
of disk drives (e.g., 122-1 to 125-r) located in a
plurality of disk drive subsets 103-1 to 103-i. The
disk drives 122-1 to 125-r are significantly less
expensive, even while providing disk drives to store
redundancy information and providing disk drives for
backup purposes, than the typical 14 inch form factor
disk drive with an associated backup disk drive. The
plurality of disk drives 122-1 to 125-r are typically
the commodity hard disk drives in the 5¼ inch form
factor.

The architecture illustrated in Figure 1 is that
of a plurality of host processors 11-12 interconnected
via the respective plurality of data channels 21, 22 -
31, 32, respectively to a data storage subsystem 100
that provides the backend data storage capacity for
the host processors 11-12. This basic configuration
is well known in the data processing art. The data
storage subsystem 100 includes a control unit 101 that
serves to interconnect the subsets of disk drives
103-1 to 103-i and their associated drive managers 102-1
to 102-i with the data channels 21-22, 31-32 that
interconnect data storage subsystem 100 with the
plurality of host processors 11, 12.

Control unit 101 includes typically two cluster
controls 111, 112 for redundancy purposes. Within a
cluster control 111 the multipath storage director
110-0 provides a hardware interface to interconnect
data channels 21, 31 to cluster control 111 contained
in control unit 101. In this respect, the multipath
storage director 110-0 provides a hardware interface
to the associated data channels 21, 31 and provides a
multiplex function to enable any attached data channel
(e.g., 21) from any host processor (e.g., 11) to
interconnect to a selected cluster control 111 within
control unit 101. The cluster control 111 itself
provides a pair of storage paths 200-0, 200-1 which
function as an interface to a plurality of optical
fiber backend channels 104. In addition, the cluster
control 111 includes a data compression function as
well as a data routing function that enables cluster
control 111 to direct the transfer of data between a
selected data channel 21 and cache memory 113, and
between cache memory 113 and one of the connected
optical fiber backend channels 104. Control unit 101
provides the major data storage subsystem control
functions that include the creation and regulation of
data redundancy groups, reconstruction of data for a
failed disk drive, switching a spare disk drive in
place of a failed disk drive, data redundancy
generation, logical device space management, and
virtual to logical device mapping. These subsystem
functions are discussed in further detail below.

Disk drive manager 102-1 interconnects the
plurality of commodity disk drives 122-1 to 125-r
included in disk drive subset 103-1 with the plurality
of optical fiber backend channels 104. Disk drive
manager 102-1 includes an input/output circuit 120
that provides a hardware interface to interconnect the
optical fiber backend channels 104 with the data paths
126 that serve control and drive circuits 121.
Control and drive circuits 121 receive the data on
conductors 126 from input/output circuit 120 and
convert the form and format of these signals as
required by the associated commodity disk drives in
disk drive subset 103-1. In addition, control and
drive circuits 121 provide a control signalling
interface to transfer signals between the disk drive
subset 103-1 and control unit 101. The data that
is written onto the disk drives in disk drive subset
103-1 consists of data that is transmitted from an
associated host processor 11 over data channel 21 to
one of cluster controls 111, 112 in control unit 101.
The data is written into, for example, cluster control
111 which stores the data in cache 113. Cluster
control 111 stores n physical tracks of data in cache
113 and then generates m redundancy segments for error
correction purposes. Cluster control 111 then selects
a subset of disk drives (122-1 to 122-n+m) to form a
redundancy group to store the received data. Cluster
control 111 selects an empty logical track, consisting
of n+m physical tracks, in the selected redundancy
group. Each of the n physical tracks of the data is
written onto one of n disk drives in the selected data
redundancy group. An additional m disk drives are
used in the redundancy group to store the m redundancy
segments. The m redundancy segments include error
correction characters and data that can be used to
verify the integrity of the n physical tracks that are
stored on the n disk drives as well as to reconstruct
one or more of the n physical tracks of the data if
that physical track were lost due to a failure of the
disk drive on which that physical track is stored.

Thus, data storage subsystem 100 can emulate one
or more large form factor disk drives (e.g., an IBM
3380K type of disk drive) using a plurality of smaller
form factor disk drives while providing a high
reliability capability by writing the data across a
plurality of the smaller form factor disk drives. A
reliability improvement is also obtained by providing
a pool of r backup disk drives (125-1 to 125-r) that
are switchably interconnectable in place of a failed
disk drive. Data reconstruction is accomplished by
the use of the m redundancy segments, so that the data
stored on the remaining functioning disk drives
combined with the redundancy information stored in the
redundancy segments can be used by control software in
control unit 101 to reconstruct the data lost when one
or more of the plurality of disk drives in the
redundancy group fails (122-1 to 122-n+m). This
arrangement provides a reliability capability similar
to that obtained by disk shadowing arrangements at a
significantly reduced cost over such an arrangement.

Disk Drive 

Each of the disk drives 122-1 to 125-r in disk
drive subset 103-1 can be considered a disk subsystem
that consists of a disk drive mechanism and its
surrounding control and interface circuitry. The disk
drive consists of a commodity disk drive which is a
commercially available hard disk drive of the type
that typically is used in personal computers. A
control processor associated with the disk drive has
control responsibility for the entire disk drive and
monitors all information routed over the various
serial data channels that connect each disk drive
122-1 to 125-r to control and drive circuits 121. Any
data transmitted to the disk drive over these channels
is stored in a corresponding interface buffer which is
connected via an associated serial data channel to a
corresponding serial/parallel converter circuit. A
disk controller is also provided in each disk drive to
implement the low level electrical interface required
by the commodity disk drive. The commodity disk drive
has an ESDI interface which must be interfaced with
control and drive circuits 121. The disk controller
provides this function. The disk controller provides
serialization and deserialization of data, CRC/ECC
generation, checking and correction and NRZ data
encoding. The addressing information such as the head
select and other types of control signals are provided
by control and drive circuits 121 to commodity disk
drive 122-1. This communication path is also provided
for diagnostic and control purposes. For example,
control and drive circuits 121 can power a commodity
disk drive down when the disk drive is in the standby
mode. In this fashion, the commodity disk drive remains
in an idle state until it is selected by control and
drive circuits 121.

Control Unit 

Figure 2 illustrates in block diagram form
additional details of cluster control 111. Multipath
storage director 110 includes a plurality of channel
interface units 201-0 to 201-7, each of which
terminates a corresponding pair of data channels 21,
31. The control and data signals received by the
corresponding channel interface unit 201-0 are output
on either of the corresponding control and data buses
206-C, 206-D, or 207-C, 207-D, respectively, to either
storage path 200-0 or storage path 200-1. Thus, as
can be seen from the structure of the cluster control
111 illustrated in Figure 2, there is a significant
amount of symmetry contained therein. Storage path
200-0 is identical to storage path 200-1 and only one
of these is described herein. The multipath storage
director 110 uses two sets of data and control busses
206-D, C and 207-D, C to interconnect each channel
interface unit 201-0 to 201-7 with both storage path
200-0 and 200-1 so that the corresponding data channel
21 from the associated host processor 11 can be
switched via either storage path 200-0 or 200-1 to the
plurality of optical fiber backend channels 104.
Within storage path 200-0 is contained a processor
204-0 that regulates the operation of storage path
200-0. In addition, an optical device interface 205-0
is provided to convert between the optical fiber
signalling format of optical fiber backend channels
104 and the metallic conductors contained within
storage path 200-0. Channel interface control 202-0
operates under control of processor 204-0 to control
the flow of data to and from cache memory 113 and one
of the channel interface units 201 that is presently
active with storage path 200-0. The channel interface
control 202-0 includes a cyclic redundancy check (CRC)
generator/checker to generate and check the CRC bytes
for the received data. The channel interface circuit
202-0 also includes a buffer that compensates for
speed mismatch between the data transmission rate of
the data channel 21 and the available data transfer
capability of the cache memory 113. The data that is
received by the channel interface control circuit
202-0 from a corresponding channel interface circuit 201
is forwarded to the cache memory 113 via channel data
compression circuit 203-0. The channel data
compression circuit 203-0 provides the necessary
hardware and microcode to perform compression of the
channel data for the control unit 101 on a data write
from the host processor 11. It also performs the
necessary decompression operation for control unit 101
on a data read operation by the host processor 11.

As can be seen from the architecture illustrated
in Figure 2, all data transfers between a host
processor 11 and a redundancy group in the disk drive
subsets 103 are routed through cache memory 113.
Control of cache memory 113 is provided in control
unit 101 by processor 204-0. The functions provided
by processor 204-0 include initialization of the cache
directory and other cache data structures, cache
directory searching and management, cache space
management, cache performance improvement algorithms
as well as other cache control functions. In
addition, processor 204-0 creates the redundancy
groups from the disk drives in disk drive subsets 103
and maintains records of the status of those devices.
Processor 204-0 also causes the redundancy data across
the n data disks in a redundancy group to be generated
within cache memory 113 and writes the m segments of
redundancy data onto the m redundancy disks in the
redundancy group. The functional software in
processor 204-0 also manages the mapping from virtual
to logical and from logical to physical devices. The
tables that describe this mapping are updated,
maintained, backed up and occasionally recovered by
this functional software on processor 204-0. The free
space collection function is also performed by
processor 204-0 as well as management and scheduling
of the optical fiber backend channels 104. Many of
these above functions are well known in the data
processing art and are not described in any detail
herein.

Disk Drive Manager 

Figure 3 illustrates further block diagram detail
of disk drive manager 102-1. Input/output circuit 120
is shown connecting the plurality of optical fiber
channels 104 with a number of data and control busses
that interconnect input/output circuit 120 with
control and drive circuits 121. Control and drive
circuits 121 consist of a command and status circuit
301 that monitors and controls the status and command
interfaces to the control unit 101. Command and
status circuit 301 also collects data from the
remaining circuits in disk drive managers 102 and the
various disk drives in disk drive subsets 103 for
transmission to control unit 101. Control and drive
circuits 121 also include a plurality of drive
electronics circuits 303, one for each of the
commodity disk drives that is used in disk drive
subset 103-1. The drive electronics circuits 303
control the data transfer to and from the associated
commodity drive via an ESDI interface. The drive
electronics circuit 303 is capable of transmitting and
receiving frames on the serial interface and contains
a microcontroller, track buffer, status and control
registers and industry standard commodity drive
interface. The drive electronics circuit 303 receives
data from the input/output circuit 120 via an
associated data bus 304 and control signals via
control leads 305. Control and drive circuits 121
also include a plurality of subsystem circuits 302-1
to 302-j, each of which controls a plurality of drive
electronics circuits 303. The subsystem circuit 302
controls the request, error and spin up lines for each
drive electronics circuit 303. Typically, a subsystem
circuit 302 interfaces with thirty-two drive
electronics circuits 303. The subsystem circuit 302
also functions to collect environmental sense
information for transmission to control unit 101 via
command and status circuit 301. Thus, the control and
drive circuits 121 in disk drive manager 102-1 perform
the data and control signal interface and transmission
function between the commodity disk drives of disk
drive subset 103-1 and control unit 101.

Disk Drive Malfunction 

The control unit 101 determines whether an
individual disk drive in the redundancy group it is
addressing has malfunctioned. The control unit 101
that has detected a bad disk drive transmits a control
message to disk drive manager 102-1 over the
corresponding control signal lead to indicate that a
disk drive has failed. When the need for a spare disk
drive is detected by the control unit 101, the faulty
disk drive is taken out of service and a spare disk
drive 125-1 is activated from the spare pool of r disk
drives (125-1 to 125-r) by the disk drive manager
102-1, at the request of control unit 101. This is
accomplished by rewriting the configuration definition
of the redundancy group that contained the bad disk
drive. The newly selected disk drive 125-1 in the
redundancy group (122-1 to 122-n+m) is identified by
control signals which are transmitted to all of
cluster controls 111-112. This ensures that the system
mapping information stored in each of cluster controls
111-112 is kept up to date.

Once the new disk drive (125-1) is added to the
redundancy group (122-1 to 122-n+m), it is tested and,
if found to be operating properly, it replaces the
failed disk drive in the system mapping tables. The
control unit 101 that requested the spare disk drive
(125-1) reconstructs the data for the new disk drive
(125-1) using the remaining n-1 operational data disk
drives and the available redundancy information from
the m redundancy disk drives. Before reconstruction
is complete on the disk, data is still available to
the host processors 11, 12, although it must be
reconstructed on line rather than just reading it from
a single disk. When this data reconstruction
operation is complete, the reconstructed segments are
written on the replacement disk drive (125-1) and the
redundancy group is again fully operational.
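
As a minimal sketch of the reconstruction step described above, the following Python example assumes the simplest case of a single redundancy drive (m = 1) that holds the byte-wise XOR parity of the n data drives. The text does not fix the redundancy code at this point (a Reed-Solomon code is mentioned elsewhere as one option), so the XOR scheme and the function names here are illustrative assumptions only.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct_segment(surviving_data, parity):
    """Rebuild the failed drive's segment from the n-1 surviving data
    segments and the parity segment (assumption: m = 1, XOR parity)."""
    return xor_blocks(list(surviving_data) + [parity])


# Hypothetical redundancy group: n = 3 data drives, m = 1 parity drive.
data = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = xor_blocks(data)

# Drive 1 has failed; rebuild its segment from drives 0 and 2 plus parity.
rebuilt = reconstruct_segment([data[0], data[2]], parity)
```

The same routine could serve both the on-line case (rebuilding a segment for a host read) and the background case (filling the replacement drive), since both need only the surviving segments and the redundancy information.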

This dynamically reconfigurable attribute of the
data storage subsystem 100 enables this system to be
very robust. In addition, the dynamically
configurable aspect of the communication path between
the cluster controls 111, 112 and the disk drives
(122-1) permits the architecture to be very flexible.
With the same physical disk drive subset (103-1), the
user can implement a disk drive memory that has a high
data storage capacity and which requires shorter
periodic repair intervals, or a disk drive memory that
has a lower data storage capacity with longer required
repair intervals simply by changing the number of
active disk drives in each redundancy group. In
addition, the disk drive memory has the ability to
detect new spare disk drives 123 when they are plugged
in to the system thereby enabling the disk drive
memory to grow as the storage or reliability needs
change without having to reprogram the disk drive
memory control software.

Dynamic Virtual Device to Logical Device Mapping 

With respect to data transfer operations, all
data transfers go through cache memory 113.
Therefore, front end or channel transfer operations
are completely independent of backend or device
transfer operations. In this system, staging
operations are similar to staging in other cached disk
subsystems but destaging transfers are collected into
groups for bulk transfers. In addition, this data
storage subsystem 100 simultaneously performs free
space collection, mapping table backup, and error
recovery as background processes. Because of the
complete front end/backend separation, the data
storage subsystem 100 is liberated from the exacting
processor timing dependencies of previous count key
data disk subsystems. The subsystem is free to
dedicate its processing resources to increasing
performance through more intelligent scheduling and
data transfer control.

The disk drive array data storage subsystem 100
consists of three abstract layers: virtual, logical
and physical. The virtual layer functions as a
conventional large form factor disk drive memory. The
logical layer functions as an array of storage units
that are grouped into a plurality of redundancy groups
(e.g., 122-1 to 122-n+m), each containing n+m disk drives
to store n physical tracks of data and m physical
tracks of redundancy information for each logical
track. The physical layer functions as a plurality of
individual small form factor disk drives. The data
storage management system operates to effectuate the
mapping of data among these abstract layers and to
control the allocation and management of the actual
space on the physical devices. These data storage
management functions are performed in a manner that
renders the operation of the disk drive array data
storage subsystem 100 transparent to the host
processors (11-12).

A redundancy group consists of n+m disk drives. 
The redundancy group is also called a logical volume 
or a logical device. Within each logical device there 
are a plurality of logical tracks, each of which is
the set of all physical tracks in the redundancy group 
which have the same physical track address. These 
logical tracks are also organized into logical 
cylinders, each of which is the collection of all 
logical tracks within a redundancy group which can be 
accessed at a common logical actuator position. A 
disk drive array data storage subsystem 100 appears to 
the host processor to be a collection of large form 
factor disk drives, each of which contains a
predetermined number of tracks of a predetermined size 
called a virtual track. Therefore, when the host 
processor 11 transmits data over the data channel 21 
to the data storage subsystem 100, the data is 
transmitted in the form of the individual records of 
a virtual track. In order to render the operation of 
the disk drive array data storage subsystem 100 
transparent to the host processor 11, the received 
data is stored on the actual physical disk drives 
(122-1 to 122-n+m) in the form of virtual track 
instances which reflect the capacity of a track on the 
large form factor disk drive that is emulated by data 
storage subsystem 100. Although a virtual track 
instance may spill over from one physical track to the 
next physical track, a virtual track instance is not
permitted to spill over from one logical cylinder to 
another. This is done in order to simplify the 
management of the memory space. 

When a virtual track is modified by the host
processor 11, the updated instance of the virtual
track is not rewritten in data storage subsystem 100
at its original location but is instead written to a
new logical cylinder and the previous instance of the
virtual track is marked obsolete. Therefore, over
time a logical cylinder becomes riddled with "holes"
of obsolete data known as free space. In order to
create entirely free logical cylinders, virtual track
instances that are still valid and located among
fragmented free space within a logical cylinder are
relocated within the disk drive array data storage
subsystem 100. In order to evenly distribute data
transfer activity, the tracks of each virtual device
are scattered as uniformly as possible among the
logical devices in the disk drive array data storage
subsystem 100. In addition, virtual track instances
are padded out if necessary to fit into an integral
number of physical device sectors. This is to ensure
that each virtual track instance starts on a sector
boundary of the physical device.
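
The padding rule above amounts to a ceiling division. The short Python sketch below illustrates it; the 512-byte sector size and the function names are assumptions chosen for the example, since the text does not state a sector size.

```python
SECTOR_BYTES = 512  # assumed sector size; not specified in the text


def padded_sectors(instance_bytes, sector_bytes=SECTOR_BYTES):
    """Number of whole sectors needed to hold a virtual track instance."""
    return -(-instance_bytes // sector_bytes)  # ceiling division


def pad_instance(payload, sector_bytes=SECTOR_BYTES):
    """Pad the payload with zero bytes up to the next sector boundary."""
    total = padded_sectors(len(payload), sector_bytes) * sector_bytes
    return payload + bytes(total - len(payload))


# A 1000-byte instance occupies two 512-byte sectors after padding.
padded = pad_instance(b"x" * 1000)
```

Because every instance occupies a whole number of sectors, the next instance written after it necessarily begins on a sector boundary.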

Mapping Tables 

It is necessary to accurately record the location
of all data within the disk drive array data storage
subsystem 100 since the data received from the host
processors 11-12 is mapped from its address in the
virtual space to a physical location in the subsystem
in a dynamic fashion. A virtual track directory is
maintained to recall the location of the current
instance of each virtual track in the disk drive array
data storage subsystem 100. The virtual track
directory consists of an entry for each virtual track
which the associated host processor 11 can address.
The entry contains the logical sector address at which
the virtual track instance begins. The virtual track
directory entry also contains data indicative of the
length of the virtual track instance in sectors. The
virtual track directory is stored in noncontiguous
pieces of the cache memory 113 and is addressed
indirectly through pointers in a virtual device table.
The virtual track directory is updated whenever a new
virtual track instance is written to the disk drives.
The storage control also includes a free space
directory (Figure 8) which is a list of all of the
logical cylinders in the disk drive array data storage
subsystem 100 ordered by logical device. Each logical
device is cataloged in a list called a free space list
for the logical device; each list entry represents a
logical cylinder and indicates the amount of free
space that this logical cylinder presently contains.
This free space directory contains a positional entry
for each logical cylinder; each entry includes both
forward and backward pointers for the doubly linked
free space list for its logical device and the number
of free sectors contained in the logical cylinder.
Each of these pointers points either to another entry
in the free space list for its logical device or is
null. The collection of free space is a background
process that is implemented in the disk drive array
data storage subsystem 100. The free space collection
process makes use of the logical cylinder directory
which is a list contained in the first sector of each
logical cylinder indicative of the contents of that
logical cylinder. The logical cylinder directory
contains an entry for each virtual track instance
contained within the logical cylinder. The entry for
each virtual track instance contains the identifier of
the virtual track instance and the identifier of the
relative sector within the logical cylinder in which
the virtual track instance begins. From this
directory and the virtual track directory, the free
space collection process can determine which virtual
track instances are still current in this logical
cylinder and therefore need to be moved to another
location to make the logical cylinder available for
writing new data.
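
The comparison between the logical cylinder directory and the virtual track directory can be sketched as follows. This is an illustrative Python model, not the subsystem's actual data layout: the class and function names are hypothetical, and it assumes that a logical sector address equals the cylinder's starting sector plus the relative sector recorded in the logical cylinder directory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VtdEntry:
    logical_sector_address: int  # where the current instance begins
    length_sectors: int          # length of the instance in sectors


def current_instances(cylinder_start, cylinder_directory, vtd):
    """Return the virtual track ids whose current instance is in this cylinder.

    cylinder_directory: list of (virtual_track_id, relative_sector) pairs
    vtd: dict mapping virtual_track_id -> VtdEntry
    """
    current = []
    for track_id, relative_sector in cylinder_directory:
        entry = vtd.get(track_id)
        # An instance is current only if the directory still points at it.
        if entry and entry.logical_sector_address == cylinder_start + relative_sector:
            current.append(track_id)
    return current


# Hypothetical cylinder starting at logical sector 4096.
vtd = {"vt7": VtdEntry(4096 + 1, 90), "vt9": VtdEntry(8192 + 40, 90)}
directory = [("vt7", 1), ("vt9", 95)]  # vt9 has since been rewritten elsewhere
still_valid = current_instances(4096, directory, vtd)
```

Only the instances returned by such a check would need to be moved before the cylinder can be reused; everything else in the cylinder is already free space.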

Data Read Operation

Figure 6 illustrates in flow diagram form the
operational steps taken by processor 204 in control
unit 101 of the data storage subsystem 100 to read
data from a data redundancy group 122-1 to 122-n+m in
the disk drive subsets 103. The disk drive array data
storage subsystem 100 supports reads of any size.
However, the logical layer only supports reads of
virtual track instances. In order to perform a read
operation, the virtual track instance that contains
the data to be read is staged from the logical layer
into the cache memory 113. The data record is then
transferred from the cache memory 113 and any clean up
is performed to complete the read operation.

At step 601, the control unit 101 prepares to
read a record from a virtual track. At step 602, the
control unit 101 branches to the cache directory
search subroutine to assure that the virtual track is
located in the cache memory 113 since the virtual
track may already have been staged into the cache
memory 113 and stored therein in addition to having a
copy stored on the plurality of disk drives (122-1 to
122-n+m) that constitute the redundancy group in which
the virtual track is stored. At step 603, the control
unit 101 scans the hash table directory of the cache
memory 113 to determine whether the requested virtual
track is located in the cache memory 113. If it is,
at step 604 control returns back to the main read
operation routine and the cache staging subroutine
that constitutes steps 605-616 is terminated.

Assume, for the purpose of this description, that
the virtual track that has been requested is not
located in the cache memory 113. Processing proceeds
to step 605 where the control unit 101 looks up the
address of the virtual track in the virtual to logical
map table. At step 606, the logical map location is
used to map the logical device to one or more physical
devices in the redundancy group. At step 607, the
control unit 101 schedules one or more physical read
operations to retrieve the virtual track instance from
appropriate ones of identified physical devices 122-1
to 122-n+m. At step 608, the control unit 101 clears
errors for these operations. At step 609, a
determination is made whether all the reads have been
completed, since the requested virtual track instance
may be stored on more than one of the N+M disk drives
in a redundancy group. If all of the reads have not
been completed, processing proceeds to step 614 where
the control unit 101 waits for the next completion of
a read operation by one of the N+M disk drives in the
redundancy group. At step 615 the next reading disk
drive has completed its operation and a determination
is made whether there are any errors in the read
operation that has just been completed. If there are
errors, at step 616 the errors are marked and control
proceeds back to the beginning of step 609 where a
determination is made whether all the reads have been
completed. If at this point all the reads have been
completed and all portions of the virtual track
instance have been retrieved from the redundancy
group, then processing proceeds to step 610 where a
determination is made whether there are any errors in
the reads that have been completed. If errors are
detected then at step 611 a determination is made
whether the errors can be fixed. One error correction
method is the use of a Reed-Solomon error
detection/correction code to recreate the data that
cannot be read directly. If the errors cannot be
repaired then a flag is set to indicate to the control
unit 101 that the virtual track instance cannot be
read accurately. If the errors can be fixed, then in
step 612 the identified errors are corrected and
processing returns back to the main routine at step
604 where a successful read of the virtual track
instance from the redundancy group to the cache memory
113 has been completed.
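
The staging subroutine of steps 603-616 can be summarized by the following Python sketch. It is a simplified model under stated assumptions: the cache is a plain dictionary standing in for the hash table directory, the physical reads are issued sequentially rather than scheduled concurrently, and the `read_segment` and `correct` callbacks are hypothetical placeholders for the device read and the error correction method.

```python
class UnreadableTrack(Exception):
    """Raised when read errors cannot be corrected (step 611)."""


def stage_virtual_track(track_id, cache, vtd, read_segment, correct):
    """Return the virtual track instance, staging it into cache on a miss.

    cache: dict track_id -> bytes
    vtd: dict track_id -> list of (device, address) segments
    read_segment(device, address) -> bytes, or raises IOError
    correct(segments, failed_indexes) -> list of bytes, or None if unfixable
    """
    if track_id in cache:                       # steps 603-604: cache hit
        return cache[track_id]
    locations = vtd[track_id]                   # steps 605-606: map lookup
    segments, failed = [], []
    for index, (device, address) in enumerate(locations):  # steps 607-616
        try:
            segments.append(read_segment(device, address))
        except IOError:
            segments.append(None)
            failed.append(index)                # step 616: mark the error
    if failed:                                  # steps 610-612
        segments = correct(segments, failed)
        if segments is None:
            raise UnreadableTrack(track_id)
    cache[track_id] = b"".join(segments)
    return cache[track_id]


# Hypothetical two-segment virtual track spread over drives d0 and d1.
store = {("d0", 0): b"AB", ("d1", 0): b"CD"}
cache = {}
vtd = {"vt1": [("d0", 0), ("d1", 0)]}
result = stage_virtual_track(
    "vt1", cache, vtd, lambda d, a: store[(d, a)], lambda s, f: None
)
```

A second call for the same track would return directly from the cache without touching the drives, which corresponds to the early return at step 604.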

At step 617, control unit 101 transfers the
requested data record from the staged virtual track
instance in which it is presently stored. Once the
records of interest from the staged virtual track have
been transferred to the host processor 11 that
requested this information, then at step 618 the
control unit 101 cleans up the read operation by
performing the administrative tasks necessary to place
all of the apparatus required to stage the virtual
track instance from the redundancy group to the cache
memory 113 into an idle state and control returns at
step 619 to service the next operation that is
requested.

Data Write Operation

Figure 7 illustrates in flow diagram form the
operational steps taken by the disk drive array data
storage subsystem 100 to perform a data write
operation. The disk drive array data storage
subsystem 100 supports writes of any size, but again,
the logical layer only supports writes of virtual
track instances. Therefore in order to perform a
write operation, the virtual track that contains the
data record to be rewritten is staged from the logical
layer into the cache memory 113. Once the write
operation is complete, the location of the obsolete
instance of the virtual track is marked as free space.
The modified data record is then transferred into the
virtual track and this updated virtual track instance
is then scheduled to be written from the cache memory
113 where the data record modification has taken place
into the logical layer. Any clean up of the write
operation is then performed once this transfer and
write is completed.

At step 701, the control unit 101 performs the 
set up for a write operation and at step 702, as with 
the read operation described above, the control unit 
101 branches to the cache directory search subroutine 
to assure that the virtual track into which the data 
is to be transferred is located in the cache memory 
113. Since all of the data updating is performed in
the cache memory 113, the virtual track in which this 
data is to be written must be transferred from the 
redundancy group in which it is stored to the cache 
memory 113 if it is not already resident in the cache 
memory 113. The transfer of the requested virtual
track instance to the cache memory 113 is performed 
for a write operation as it is described above with 
respect to a data read operation and constitutes steps 
603-616 illustrated in Figure 6 above. 

At step 703, the control unit 101 marks the
virtual track instance that is stored in the
redundancy group as invalid in order to assure that
the logical location at which this virtual track
instance is stored is not accessed in response to
another host processor 12 attempting to read or write
the same virtual track. Since the modified record
data is to be written into this virtual track in the
cache memory 113, the copy of the virtual track that
resides in the redundancy group is now inaccurate and
must be removed from access by the host processors
11-12. At step 704, the control unit 101 transfers the
modified record data received from host processor 11
into the virtual track that has been retrieved from
the redundancy group into the cache memory 113 to
thereby merge this modified record data into the
original virtual track instance that was retrieved
from the redundancy group. Once this merge has been
completed and the virtual track now is updated with
the modified record data received from host processor
11, the control unit 101 must schedule this updated
virtual track instance to be written onto a redundancy
group somewhere in the disk drive array data storage
subsystem 100.

This scheduling is accomplished by the subroutine
that consists of steps 706-711. At step 706, the
control unit 101 determines whether the virtual track
instance as updated fits into an available open
logical cylinder. If it does not fit into an
available open logical cylinder, then at step 707
this presently open logical cylinder must be closed
out and written to the physical layer and another
logical cylinder selected from the most free logical
device or redundancy group in the disk drive array
data storage subsystem 100. At step 708, the
selection of a free logical cylinder from the most
free logical device takes place. This ensures that
the data files received from host processor 11 are
distributed across the plurality of redundancy groups
in the disk drive array data storage subsystem 100 in
an even manner to avoid overloading certain redundancy
groups while underloading other redundancy groups.
Once a free logical cylinder is available, either
being the presently open logical cylinder or a newly
selected logical cylinder, then at step 709, the
control unit 101 writes the updated virtual track
instance into the logical cylinder and at step 710 the
new location of the virtual track is placed in the
virtual to logical map in order to render it available
to the host processors 11-12. At step 711, control
returns to the main routine, where at step 712 the
control unit 101 cleans up the remaining
administrative tasks to complete the write operation
and return to an available state at 712 for further
read or write operations from host processor 11.

Data Move/Copy Operation 

The data file move/copy operation instantaneously
relocates or creates a second instance of a selected
data file by merely generating a new set of pointers
to reference the same physical memory location as the
original set of reference pointers in the virtual
track directory. In this fashion, by simply
generating a new set of pointers referencing the same
physical memory space, the data file can be
moved/copied.

This apparatus instantaneously moves the original
data file without the time penalty of having to
download the data file to the cache memory 113 and
write the data file to a new physical memory location.
For the purpose of enabling a program to simply access
the data file at a different virtual address the use
of this mechanism provides a significant time
advantage. A physical copy of the original data
record can later be written as a background process to
a second memory location, if so desired.
Alternatively, when one of the programs that can
access the data file writes data to or modifies the
data file in any way, the modified copy of a portion
of the original data file is written to a new physical
memory location and the corresponding address pointers
are changed to reflect the new location of this
rewritten portion of the data file. In this
fashion, a data file can be instantaneously
moved/copied by simply creating a new set of memory
pointers and the actual physical copying of the data
file can take place either as a background process or
incrementally as necessary when each virtual track of
the data file is modified by one of the programs that
accesses the data file.
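
The pointer-based move/copy described above behaves like a copy-on-write scheme. The toy Python model below illustrates the idea; the `TrackStore` class and its methods are hypothetical and collapse the virtual track directory and physical storage into two dictionaries.

```python
class TrackStore:
    """Toy model: a virtual track directory pointing into physical storage."""

    def __init__(self):
        self.physical = {}       # location -> data
        self.directory = {}      # virtual address -> location
        self._next_location = 0

    def write(self, virtual_address, data):
        """Always write to a new location and repoint the directory."""
        location = self._next_location
        self._next_location += 1
        self.physical[location] = data
        self.directory[virtual_address] = location

    def copy(self, source, target):
        """'Instantaneous' copy: duplicate the pointer, not the data."""
        self.directory[target] = self.directory[source]

    def read(self, virtual_address):
        return self.physical[self.directory[virtual_address]]


store = TrackStore()
store.write("A", b"original")
store.copy("A", "B")
shared = store.directory["A"] == store.directory["B"]  # same physical location
store.write("B", b"modified")                          # B diverges on update
```

After the copy, both virtual addresses reference one physical location; only when one of them is modified is new physical space consumed, and the other address continues to see the original data.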

Virtual Track Directory Source and Target Flags 

Each entry in the Virtual Track Directory (VTD) 
contains two flags associated with the Copy/Move 
function. The "Source" flag is set whenever a Virtual
Track Instance at this Virtual Track Address has been 
the origin of a copy or move. The Virtual Track 
Instance pointed to by this entry is not necessarily 
the Source, but the Virtual Track Instance contains 
this Virtual Address. If the Source flag is set, 
there is at least one entry in the Copy Table for this 
Virtual Address. The "Target" flag is set whenever a 
Virtual Track Instance contains data that has been the 
destination of a copy or move. If the Target flag is 
set, the Virtual Address in the Virtual Track Instance 
that is pointed to is not that of the VTD Entry.

The format of the Copy Table is illustrated here
graphically. The preferred implementation is to have
a separate Copy Table for each Logical Device so that
there is a Copy Table head and tail pointer associated
with each Logical Device; however, the table could
just as easily be implemented as a single table for
the entire subsystem. The table is ordered such that
the sources are in ascending Logical Address order.

    COPY TABLE SOURCE HEAD POINTER
          |
          v
       SOURCE --> TARGET --> TARGET
          |
          v
       SOURCE --> TARGET
          |
          v
       SOURCE --> TARGET --> TARGET --> TARGET
          ^
          |
    COPY TABLE SOURCE TAIL POINTER

The table is a singly linked list of Sources where
each Source is the head of a linked list of Targets.

The Source Entry contains the following:

    Logical Address (VTD Entry Copy)
    Virtual Address
    Next Source Pointer (NULL if last Source in list)
    Target Pointer

The Target Entry contains the following:

    Virtual Address
    Next Target Pointer (NULL if last Target in list)
    Update Count Fields Flag
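
The Source and Target entries listed above map naturally onto two linked record types. The Python sketch below is one possible rendering, offered as an assumption-laden illustration: the class names, the `add_copy` and `targets_of` helpers, and the choice to merge copies that share a Logical Address under one Source are not taken from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TargetEntry:
    virtual_address: int
    update_count_fields: bool = False
    next_target: Optional["TargetEntry"] = None


@dataclass
class SourceEntry:
    logical_address: int
    virtual_address: int
    next_source: Optional["SourceEntry"] = None
    target: Optional[TargetEntry] = None


class CopyTable:
    def __init__(self):
        self.head = None  # Copy Table Source head pointer
        self.tail = None  # Copy Table Source tail pointer

    def add_copy(self, logical_address, source_va, target_va):
        """Record a copy, keeping Sources in ascending Logical Address order."""
        prev, node = None, self.head
        while node and node.logical_address < logical_address:
            prev, node = node, node.next_source
        if node is None or node.logical_address != logical_address:
            node = SourceEntry(logical_address, source_va, next_source=node)
            if prev is None:
                self.head = node
            else:
                prev.next_source = node
            if node.next_source is None:
                self.tail = node
        # Push the new Target onto the front of this Source's Target list.
        node.target = TargetEntry(target_va, next_target=node.target)

    def targets_of(self, logical_address):
        """Virtual addresses of all Targets of the Source at this address."""
        node = self.head
        while node and node.logical_address != logical_address:
            node = node.next_source
        out, target = [], node.target if node else None
        while target:
            out.append(target.virtual_address)
            target = target.next_target
        return out


table = CopyTable()
table.add_copy(300, 30, 31)
table.add_copy(100, 10, 11)
table.add_copy(300, 30, 32)
```

Keeping the Sources sorted by Logical Address means a scan for a given address can stop at the first Source whose address exceeds it.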



Free Space Collection 

When data in cache memory 113 is modified, it
cannot be written back to its previous location on a
disk drive in disk drive subsets 103 since that would
invalidate the redundancy information on that logical
track for the redundancy group. Therefore, once a
virtual track has been updated, that track must be
written to a new location in the data storage
subsystem 100 and the data in the previous location
must be marked as free space. Therefore, in each
redundancy group, the logical cylinders become riddled
with "holes" of obsolete data in the form of virtual
track instances that are marked as obsolete. In order
to completely empty logical cylinders for destaging,
the valid data in partially valid cylinders must be
read into cache memory 113 and rewritten into new
previously emptied logical cylinders. This process is
called free space collection. The free space
collection function is accomplished by control unit
101. Control unit 101 selects a logical cylinder that
needs to be collected as a function of how much free
space it contains. The free space determination is
based on the free space directory as illustrated in
Figure 8, which indicates the availability of unused
memory in data storage subsystem 100. The table
illustrated in Figure 8 is a listing of all of the
logical devices contained in data storage subsystem
100 and the identification of each of the logical
cylinders contained therein. The entries in this
chart represent the number of free physical sectors in
this particular logical cylinder. A write cursor is
maintained in memory and this write cursor indicates
the available open logical cylinder that control unit
101 will write to when data is destaged from cache 113
after modification by associated host processor 11-12
or as part of a free space collection process. In
addition, a free space collection cursor is maintained
which points to the present logical cylinder that is
being cleared as part of a free space collection
process. Therefore, control unit 101 can review the
free space directory illustrated in Figure 8 as a
backend process to determine which logical cylinder on
a logical device would most benefit from free space
collection. Control unit 101 activates the free space
collection process by reading all of the valid data
from the selected logical cylinder into cache memory
113. The logical cylinder is then listed as
completely empty, since all of the virtual track
instances therein are tagged as obsolete. Additional
logical cylinders are collected for free space
collection purposes or as data is received from an
associated host processor 11-12 until a complete
logical cylinder has been filled. Once a complete
logical cylinder has been filled, a new previously
emptied logical cylinder is chosen.
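
The selection and collection steps can be sketched in Python as follows. The text says only that a cylinder is chosen "as a function of how much free space it contains", so the policy used here (pick the partially valid cylinder with the most free sectors) is an assumption, as are the function names and the dictionary form of the free space directory.

```python
def select_cylinder(free_space_directory, capacity):
    """Return the (device, cylinder) with the most free sectors among
    cylinders that still hold some valid data (0 < free < capacity).

    free_space_directory: dict device -> dict cylinder -> free sector count
    """
    candidates = [
        (free, device, cylinder)
        for device, cylinders in free_space_directory.items()
        for cylinder, free in cylinders.items()
        if 0 < free < capacity
    ]
    if not candidates:
        return None
    free, device, cylinder = max(candidates)
    return device, cylinder


def collect(device, cylinder, valid_instances, free_space_directory,
            capacity, write):
    """Rewrite the valid instances elsewhere, then list the cylinder as empty.

    write(track_id, data) stands in for destaging to the cylinder at the
    write cursor.
    """
    for track_id, data in valid_instances:
        write(track_id, data)
    free_space_directory[device][cylinder] = capacity


# Hypothetical directory: free sector counts for 100-sector cylinders.
directory = {"ld0": {0: 10, 1: 80}, "ld1": {0: 100, 1: 55}}
choice = select_cylinder(directory, 100)
moved = []
collect("ld0", 1, [("vt3", b"x")], directory, 100,
        lambda track_id, data: moved.append(track_id))
```

Choosing the cylinder with the most free space minimizes the amount of valid data that has to be read and rewritten to produce one empty cylinder.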

Figure 9 illustrates in flow diagram form the
operational steps taken by processor 204 to implement
the free space collection process. The use of Source
and Target Flags is necessitated by the free space
collection process since this process must determine
whether each virtual track instance contains valid or
obsolete data. In addition, the free space collection
process performs the move/copy count field adjustment
operations listed in the copy table. The basic
process is initiated at step 901 when processor 204
selects a logical cylinder for collection based on the
number of free logical sectors as listed in the table
of Figure 8. Processor 204 checks each virtual track
directory entry to determine if the Source Flag is
set. If not, the process exits at step 909 to the
next logical track. If the Source Flag is set, at
step 902 processor 204 scans the source list to find
the logical address in the logical cylinder directory.
If no address is found, this virtual track instance is
an obsolete version and is no longer needed (invalid).
This data is not relocated.

If the address is found, at step 904, processor 
204 compares the logical cylinder directory logical 
address with the virtual track directory entry logical 
address. If there is a match, processor 204 creates 
a logical cylinder directory entry for this virtual 
track instance. If there is not a match, the Source 
has been updated and exists elsewhere. Processor 204 
at step 906 updates the virtual track instance 
descriptor to remove the source virtual address. Upon 
completion of either step 905 or 906, processor 204 at 
step 907 for all Targets in this Source's Target List 
updates the virtual track instance descriptor to 
include this virtual address and the update count 
fields flag from the Copy Table. In addition, 
processor 204 creates a logical cylinder directory 
entry for this virtual track instance. Finally, 
processor 204 updates the virtual track directory 
entry for the Target to point to the new location and 
to clear the Target Flag. Processor 204 at step 908 
removes this Source and all its Targets from the Copy 
Table. Processor 204 also scans the Copy Table for 
Sources with the same virtual address and clears the 
Source Flag. The changes are then journaled to the 
virtual track directory and to the Copy Table. 

Duplex Copy Group Capability Emulation 

Figures 4 and 5 illustrate in block diagram form
the data structures used to provide duplex copy group
capability while Figure 10 illustrates in flow
diagram form the operational steps taken by the data
storage subsystem to provide the duplex copy group
capability. In addition to transmitting data records
to the data storage subsystem 100 for storage therein,
the host processor transmits channel commands which
are instructions to the data storage subsystem 100 to
control the address at which the data records are
stored and to designate the mode of operation of data
storage subsystem 100. These channel commands are
well known in the art and are not disclosed in any
detail herein.

One capability presently found in data storage
subsystems, such as IBM's 3990 Storage Control Unit
(as documented in the IBM publication titled "IBM 3990
Storage Control Reference" reference no. GA32-0099-3),
is the duplex copy group capability. As noted above,
in order to improve the reliability of data storage on
the disk drives, the host processor can designate two
disk drives, connected to a single control unit, as a
duplex pair, wherein a data record stored on a primary
disk drive in the duplex pair is also concurrently
stored by the storage control unit on the secondary
disk drive of the duplex pair. In this manner,
duplicate copies are kept of each data record stored
in the data storage subsystem.

The host processor 11 activates this feature by
transmitting channel commands to the storage control
unit 101 designating the primary and secondary disk
drives to be used in a duplex pair configuration.
This process is initiated at step 1001 on Figure 10,
wherein the host processor 11 transmits a "create
duplex copy group" channel command to data storage
subsystem 100, which channel command designates the
primary and secondary disk drives. Data storage
subsystem 100 is a dynamically mapped virtual device
data storage system. Therefore, the disk drive
devices designated by the host processor 11 do not in
reality exist in the form that is understood by the
host processor 11. In particular, the data storage
subsystem 100 makes use of a plurality of small disk
drives interconnected into redundancy groups to
emulate the operation of large form factor disk
drives. The host processor 11, in designating a
primary storage device, designates what appears to be
a large form factor disk drive but which in reality
consists of portions of at least one redundancy group
in the disk drive array 103 of the data storage
subsystem 100. As noted above, this emulation is
accomplished through the use of mapping tables which
map the virtual image of the emulated device to
physical storage locations on the small form factor
disk drives in the redundancy group.

This is illustrated schematically in Figure 4,
wherein host processor 11 defines at step 1002 (Figure
10) a duplex copy group which includes primary and
secondary data storage devices. Control unit 101
responds to this command by creating a copy group
descriptor 400 entry in cache memory 113 which
contains pointers 431, 432 that designate the virtual
devices 401 and 402 as defined in the Virtual Device
Table entries in control unit 101 of the data storage
subsystem 100. The mapping in control unit 101 is
performed by an available processor 204 in one of
storage paths 200 in one of cluster controls 111, 112.
The mapping tables are stored in shared memory in
cache 113 and are available to all processors 204 in
control unit 101. This virtual device 401 defined by
the Virtual Device Table entry in control unit 101
maps to a set of Virtual Track Directory entries 411
in the virtual track directory 410 that is maintained
by control unit 101 in cache memory 113. These
Virtual Track Directory entries 411 contain data
indicative of the mapping of the virtual track as
defined by control unit 101, to the Virtual Track
Instances, which are the actual physical storage
locations in the redundancy groups 421-428 which
contain the data records for that defined virtual
track. The mapping information therefore represents
pointers 434 which point to the physical storage
locations 421-428. In response to the host processor
11 designating a primary data storage device, control
unit 101 of the data storage subsystem 100 assigns the
primary virtual data storage device 401 and a
plurality of virtual track directory entries 411
associated with this virtual data storage device 401.
The host processor 11 also designates a secondary data
storage device which is paired with the primary data
storage device for storing the backup or duplicate
copies of the data records stored in the primary data
storage device. The disk drive array architecture of
data storage subsystem 100 obviates the need for
maintaining a second physical copy of the data record
that is stored in the primary virtual data storage
device 401. However, in order to be responsive to the
commands transmitted by host processor 11, the control
unit 101 of data storage subsystem 100 at step 1003
emulates the secondary data storage device 402 by
assigning a secondary virtual data storage device
which simply consists of data indicative of the
location of the primary virtual storage device 401.
The primary virtual data storage device 401 is itself
simply a pointer to a set of entries 411 in a mapping
table and the secondary virtual data storage device
402 is therefore a simple pointer 437 pointing to this
table of data entries 411 via the primary virtual data
storage device 401. There is no physical storage
associated with the secondary virtual data storage
device 402 and therefore no virtual track directory
entries are assigned to the secondary virtual data
storage device 402. The secondary virtual data
storage device 402 shares the realization of primary
virtual data storage device 401 by referencing the
Virtual Track Directory entries 411 and the Virtual
Track Instances to which they point. Using this
architecture, the host processor 11 can access either
the primary 401 or the secondary 402 virtual data
storage device in the conventional manner since access
to the secondary virtual data storage device 402 is
processed by data storage subsystem 100 by simply
redirecting the request to the primary virtual data
storage device 401 as defined by the control unit 101.
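
The pointer arrangement of Figure 4 can be sketched in Python as
follows. This is a minimal illustration under stated assumptions, not
the control unit's actual implementation: the class VirtualDevice, the
functions create_duplex_copy_group and read_location, and the
(redundancy group, slot) tuples standing in for physical storage
locations are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical physical location: (redundancy group number, slot in group).
Location = Tuple[int, int]


@dataclass
class VirtualDevice:
    """Illustrative stand-in for a Virtual Device Table entry (401 or 402)."""
    name: str
    # Virtual track number -> physical location (directory entries 411).
    entries: Optional[Dict[int, Location]] = field(default_factory=dict)
    # Set only on a phantom secondary: reference to the primary (pointer 437).
    primary: Optional["VirtualDevice"] = None


def create_duplex_copy_group(primary: VirtualDevice,
                             secondary: VirtualDevice) -> Dict[str, VirtualDevice]:
    """Pair two devices; the secondary receives no directory entries."""
    secondary.entries = None
    secondary.primary = primary
    # Stand-in for copy group descriptor 400 with pointers 431 and 432.
    return {"primary": primary, "secondary": secondary}


def read_location(device: VirtualDevice, track: int) -> Location:
    """Look up a track; a request to the secondary is redirected to the primary."""
    owner = device.primary if device.primary is not None else device
    return owner.entries[track]


dev401 = VirtualDevice("401", {0: (421, 7), 1: (423, 2)})
dev402 = VirtualDevice("402")
group = create_duplex_copy_group(dev401, dev402)
```

In this sketch a lookup through either device returns the same
location, while only the primary holds directory entries.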

This architecture has significant advantages over
the conventional duplex copy group operation since, in
the prior art, a data record written to one data
storage device of the duplex pair requires a second
data write operation to the associated other storage
device of the duplex pair. The necessity to write two
copies of the data record on disk drives represents a
processing burden on the typical storage control unit
since it takes twice as much time for the storage
control unit to write the dual copies as opposed to
writing a single copy into the disk drives. The
control unit 101 of the present apparatus simply
writes one copy of the data record in the redundancy
groups designated by the virtual track directory entry
411 for this virtual data storage device. No
additional overhead is required to provide the duplex
copy group operation since there is a single shared
realization of the two virtual data storage devices
401, 402.

Alternatively, the correspondence between the
received data records and the identity of the disk
drives in the selected redundancy group on which they
are stored can be accomplished by maintaining two
Virtual Track Directory entries 411, 414 in the
virtual track directory 410, each of which contains
identical data indicative of the mapping of the
virtual track, as defined by control unit 101, to the
Virtual Track Instances in redundancy groups 421-428.
This is illustrated schematically in Figure 5 by the
set of pointers 434, 435 associated with each of the
virtual track directory entries 411, 414 indicative of
two identical copies of the data records. This
configuration also conserves physical space in the
redundancy groups but requires additional Virtual
Track Directory entries in comparison to the
implementation previously discussed.
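
The trade-off of this alternative can be made concrete with a short
Python sketch. It assumes, purely for illustration, that a directory is
a mapping from virtual track number to a (redundancy group, slot)
tuple; the helper names below are hypothetical and do not appear in the
apparatus described.

```python
# Hypothetical directories: virtual track number -> (redundancy group, slot).
entries_411 = {0: (421, 7), 1: (423, 2)}   # primary entries, pointers 434
entries_414 = dict(entries_411)            # secondary entries, pointers 435


def physical_locations_used(*directories):
    """Count the distinct physical locations referenced by the directories."""
    return len({loc for d in directories for loc in d.values()})


def directory_entries_used(*directories):
    """Count the directory entries consumed across all directories."""
    return sum(len(d) for d in directories)


locations = physical_locations_used(entries_411, entries_414)
entries = directory_entries_used(entries_411, entries_414)
```

With two tracks, the sketch references two physical locations but
consumes four directory entries, which is the extra cost noted above.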

Suspend Duplex Copy Group Operation

The host processor 11 can suspend the duplex copy
group operation and require that the two disk drives
operate independently of each other. The host processor
11 suspends the duplex copy group operation by
transmitting a "suspend duplex copy group" channel
command to data storage subsystem 100 at step 1004.
Since there is only one physical copy of the data
records in data storage subsystem 100 and only one set
of pointers that map the primary and secondary virtual
data storage devices to the shared set of physical
data storage locations, the data storage subsystem 100
must create a second realization of the shared virtual
data storage device since the host processor 11 can
write data to either of these data storage devices
independently of the other. In order to accomplish
this, the storage control unit 101 simply replicates
at step 1005 the virtual track directory entries 411
associated with the primary virtual data storage
device 401 and assigns these new virtual track
directory entries 414 to the secondary virtual data
storage device 402 that was assigned by the control
unit 101. This step of replication can also be
implemented via the copy operation described above
wherein a pointer 438 to the primary virtual track
directory entries 411 from secondary virtual data
storage device 402 is used to instantaneously copy the
directory entries 411.
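
The replication performed at step 1005 can be sketched as follows. The
class VirtualDevice and the function suspend_duplex_copy_group are
hypothetical names, and the dictionary of (redundancy group, slot)
tuples is an assumed stand-in for the virtual track directory entries;
the sketch shows only the first replication method described above, not
the pointer 438 variant.

```python
from typing import Dict, Optional, Tuple

Location = Tuple[int, int]  # hypothetical (redundancy group, slot) pair


class VirtualDevice:
    """Illustrative device record: own entries, or a pointer to a primary."""

    def __init__(self, entries: Optional[Dict[int, Location]] = None,
                 primary: Optional["VirtualDevice"] = None) -> None:
        self.entries = entries
        self.primary = primary


def suspend_duplex_copy_group(primary: VirtualDevice,
                              secondary: VirtualDevice) -> None:
    """Give the secondary its own copy of the primary's directory entries."""
    # Step 1005: replicate entries 411 as new entries 414. Only the
    # pointers are copied; no data record is moved or duplicated.
    secondary.entries = dict(primary.entries)
    # The secondary no longer refers to the primary.
    secondary.primary = None


dev401 = VirtualDevice(entries={0: (421, 7), 1: (423, 2)})
dev402 = VirtualDevice(primary=dev401)
suspend_duplex_copy_group(dev401, dev402)
```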

Figure 5 illustrates schematically the result of
the first noted copy operation. Each virtual data
storage device 401, 402 is defined and represents a
large form factor disk drive to the host processor 11.
Each virtual data storage device 401, 402 has a set of
virtual track directory entries 411, 414 associated
therewith, which entries map the virtual track of an
emulated large form factor disk drive to the actual
physical storage locations in the redundancy groups
421-428 wherein the data records for that track are
stored. At the moment the host processor 11 suspends
duplex copy group operation, the data records stored
in the primary virtual data storage device 401 are
identical to the data records stored in the secondary
virtual data storage device 402 since the virtual
track directory entries 411, 414 associated with both
of these devices are identical: the pointers contained
therein are identical and point to the same physical
data records stored in the redundancy groups 421-428.
Therefore, even though a second set of Virtual Track
Directory entries 414 is created, there is still a
partial shared realization of the primary virtual data
storage device since the Virtual Track Instances on
the disk drives 421-428 in the redundancy group are
shared by both primary 401 and secondary 402 virtual
data storage devices.

This is illustrated schematically in Figure 5 by
the set of pointers 434, 435 associated with each of
the virtual track directory entries 411, 414
indicative of the two identical copies of the data
records. As the host processor 11 writes data to one
or the other of these virtual data storage devices,
the corresponding virtual track directory entries 411,
414 are updated. Since, as noted above, data records
are never updated in place, any change made thereto
does not modify the original data record stored in the
redundancy groups 421-428 but instead creates a new
data record which is stored in a new physical location
either within the same redundancy group or in another
redundancy group. Therefore, over time, the data
storage subsystem 100 migrates toward two separate
realizations of the two virtual data storage devices
as the host processor 11 writes new data or updates
data records stored in the virtual data storage
devices 401, 402. The two devices increasingly
contain different entries in the virtual track
directories 411, 414 and point to different physical
locations in the redundancy groups 421-428 where the
data records are stored.
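
The gradual divergence described above can be sketched in Python. The
allocator, the choice of redundancy group 425 for new data, and the
function name write_track are assumptions made only for this example;
the source states only that a modified record is stored in a new
physical location and the corresponding directory entry is updated.

```python
import itertools
from typing import Dict, Tuple

Location = Tuple[int, int]  # hypothetical (redundancy group, slot) pair

# Hypothetical allocator handing out unused slots in redundancy group 425.
_next_free_slot = itertools.count(0)


def write_track(entries: Dict[int, Location], track: int) -> Location:
    """Store new data at a fresh location and repoint this device's entry.

    The previous location is left untouched, so another device whose
    entry still points at it continues to see the original record.
    """
    new_location = (425, next(_next_free_slot))
    entries[track] = new_location
    return new_location


entries_411 = {0: (421, 7), 1: (423, 2)}   # primary device 401
entries_414 = dict(entries_411)            # secondary device 402 after suspend

write_track(entries_414, 1)                # host updates track 1 on device 402

# Tracks whose physical realization is still shared by both devices.
shared_tracks = sorted(t for t in entries_411
                       if entries_411[t] == entries_414[t])
```

After the single write, track 0 is still shared while track 1 has
separate realizations on the two devices.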

Reinstate Duplex Copy Group Operation 

The host processor 11 can reinstate the duplex
copy group operation by transmitting at step 1006 a
"re-establish duplex copy group" channel command to
the data storage subsystem 100 indicating which of the
two data storage devices is to be saved and
designated as the primary data storage device. In
response to the re-establish duplex copy group channel
command received by the data storage subsystem 100 from
the host processor 11, the data storage subsystem 100 at
step 1007 simply erases the virtual track directory
entries 414 associated with the virtual data storage
device 402 that the host processor 11 has indicated
should be deleted. The remaining virtual data storage
device 401 is now the primary data storage device and
a secondary virtual data storage device is implemented
(Figure 4) as noted above by simply linking the
Virtual Device Table entry 402 in control unit 101
with pointer 437 to the primary virtual data storage
device 401. Therefore, the data storage subsystem 100
can re-establish a duplex copy group operation in a
fraction of the time typically required of a data
storage system since this operation represents the
manipulation of a few pointers as opposed to the
complete replication of all of the data records stored
on the primary data storage device into a secondary
data storage device defined by the host processor.
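
The pointer manipulation of step 1007 can be sketched as follows. As
before, VirtualDevice and reestablish_duplex_copy_group are
hypothetical names, and the directory is modeled as an assumed mapping
from track number to a (redundancy group, slot) tuple.

```python
from typing import Dict, Optional, Tuple

Location = Tuple[int, int]  # hypothetical (redundancy group, slot) pair


class VirtualDevice:
    """Illustrative device record: own entries, or a pointer to a primary."""

    def __init__(self, entries: Optional[Dict[int, Location]] = None,
                 primary: Optional["VirtualDevice"] = None) -> None:
        self.entries = entries
        self.primary = primary


def reestablish_duplex_copy_group(keep: VirtualDevice,
                                  discard: VirtualDevice) -> None:
    """Erase the discarded device's entries and relink it to the kept one."""
    discard.entries = None    # step 1007: erase directory entries 414
    discard.primary = keep    # link with pointer 437 to the primary


# State after a suspension during which the host rewrote track 1 on 401.
dev401 = VirtualDevice(entries={0: (421, 7), 1: (425, 0)})
dev402 = VirtualDevice(entries={0: (421, 7), 1: (423, 2)})
reestablish_duplex_copy_group(keep=dev401, discard=dev402)
```

No data record is copied in this sketch; the cost is independent of
the number of records held by the primary.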

Terminate Duplex Copy Group 

The host processor 11 can terminate the duplex
copy group operation and require that the two disk
drives operate independently of each other. The host
processor 11 terminates the duplex copy group
operation by transmitting a "terminate duplex copy
group" channel command to data storage subsystem 100
at step 1008. If the duplex copy group is in a
suspended state, as a result of the actions of data
storage subsystem 100 at step 1005, the suspension is
made permanent by data storage subsystem 100 at step
1009. Otherwise, data storage subsystem 100
permanently suspends the duplex copy group as in step
1005.
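
The two cases handled at step 1009 can be summarized in a small Python
sketch. The GroupState enumeration and the replicate_entries callback
are hypothetical constructs introduced for illustration; the callback
stands for the replication of directory entries performed at step 1005.

```python
from enum import Enum
from typing import Callable


class GroupState(Enum):
    DUPLEX = "duplex"          # secondary is a pointer to the primary
    SUSPENDED = "suspended"    # each device has its own directory entries
    TERMINATED = "terminated"  # devices are permanently independent


def terminate_duplex_copy_group(state: GroupState,
                                replicate_entries: Callable[[], None]) -> GroupState:
    """Handle a "terminate duplex copy group" command (steps 1008, 1009)."""
    if state is GroupState.DUPLEX:
        # Not yet suspended: perform the same replication as step 1005.
        replicate_entries()
    # If already suspended, the existing separation simply becomes permanent.
    return GroupState.TERMINATED


calls = []
from_suspended = terminate_duplex_copy_group(
    GroupState.SUSPENDED, lambda: calls.append("replicate"))
calls_after_suspended = list(calls)
from_duplex = terminate_duplex_copy_group(
    GroupState.DUPLEX, lambda: calls.append("replicate"))
```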

While a specific embodiment of this invention has
been disclosed herein, it is expected that those
skilled in the art can design other embodiments that
differ from this particular embodiment but fall within
the scope of the appended claims.

WE CLAIM:

1. A disk memory system (100) for storing data
records, which are transmitted to said disk memory
system (100) by at least one associated host processor
(11, 12), on one of a plurality of virtual data
storage devices located in said disk memory system
(100) and identified by said host processor (11, 12)
comprising:

a plurality of disk drives (12*-*), a subset
of said plurality of disk drives (12*-*) being
configured into at least two redundancy groups (421 -
428), each of which includes at least two disk drives
(12*-*);

means (101), responsive to the receipt of a
stream of data records, for selecting available memory
space in one of said redundancy groups (421) to store
said received stream of data records thereon;

means (104, 120, 121) for writing each of
said received streams of data records and redundancy
data associated with said received streams of data
records in said selected available memory space;

means (113) for maintaining data indicative
of the correspondence between each of said received
streams of data records and the identity of the disk
drives (12*-*) in said selected redundancy group (421)
on which each of said received streams of data records
is stored;

means (101, 1002 - 1009), responsive to said
host processor (11, 12) requesting activation of
duplex copy group capability for designated primary
(401) and secondary (402) virtual data storage devices
in said disk memory system (100), for emulating said
secondary virtual data storage device (402), including:

means (1002) for storing data (431)
indicative of the identity of said
designated primary virtual data storage
device (401), including said correspondence
data (411) which identifies said disk
drives (421) on which each of said received
streams of data records is stored,

means (1003) for storing data (432)
indicative of the identity of said
designated secondary virtual data storage
device (402), including data (437) which
identifies said disk drives (421) on which
each of said received streams of data
records is stored in said designated
primary virtual data storage device (401),
and

means (204), responsive to a query
from said host processor (11, 12) to said
designated secondary virtual data storage
device (402), for accessing said disk
drives (421) of said designated primary
virtual data storage device (401).

2. The system of claim 1 wherein said
correspondence data comprises a set of pointers (411)
which identify said selected available memory space in
said selected redundancy group (421), and said
secondary virtual data storage device identity data
comprises data (437) indicative of the identity of
said designated primary virtual data storage device
(401), said emulating means (204, 1002 - 1009) further
includes:

means (1005), responsive to said host
processor (11, 12) transmitting a command to said disk
memory system (100) to discontinue duplex copy group
operation, for creating a duplicate copy of said
correspondence data (411) for said designated primary
virtual data storage device (401),

means (414) for associating said copied
correspondence data with said designated secondary
virtual data storage device (402),

means (204) for deleting said stored data
(437) indicative of the identity of said designated
primary virtual data storage device (401) from said
secondary virtual data storage device identity data
(414), and

means (101), responsive to a query from said
host processor (11, 12), for interpreting said
duplicate copy (414) of said correspondence data (411)
as said secondary virtual data storage device (402).

3. The system of claim 2 wherein said host
processor (11, 12) transmits a command to said disk
memory system (100) to reestablish said discontinued
duplex copy group by deleting one of said primary
(401) and secondary (402) virtual data storage
devices, said emulating means (204, 1002 - 1009)
further includes:

means (1007) for deleting said
correspondence data (411, 414) for said designated
deleted virtual data storage device (401, 402).

4. The system of claim 1 wherein said
correspondence data (411) comprises a set of pointers
which identify said selected available memory space in
said selected redundancy group (421) and said stored
secondary virtual data storage device identity data
(414) comprises a copy of said correspondence data
(411) for said primary virtual data storage device
(401), said emulating means (204, 1002 - 1009) further
comprises:

means (1009), responsive to said host
processor (11, 12) transmitting a command to said disk
memory system (100) to discontinue duplex copy group
operation, for maintaining said primary (411) and
secondary (414) virtual data storage device
correspondence data independent of each other.

5. The system of claim 1 wherein said
correspondence data comprises a set of pointers (411)
which identify said selected available memory space in
said selected redundancy group (421), and said
secondary virtual data storage device identity data
(432) comprises data (437) indicative of the identity
of said designated primary virtual data storage device
(401), said emulating means (204, 1002 - 1009) further
includes:

means (1005), responsive to said host
processor (11, 12) transmitting a command to said disk
memory system (100) to suspend duplex copy group
operation, for creating data indicative of the
identity of said correspondence data (411) for said
designated primary virtual data storage device (401),

means (402) for associating said
correspondence data identity data with said designated
secondary virtual data storage device (402),

means (1007) for deleting said stored data
(437) indicative of the identity of said designated
primary virtual data storage device (401) from said
secondary virtual data storage device identity data
(432), and

means (101), responsive to a query from said
host processor (11, 12), for interpreting said
correspondence data as said secondary virtual data
storage device (402).

6. The system of claim 1 further including:

means (101, 104, 120, 121), responsive to
the subsequent receipt of modifications to one of said
data records stored in said designated primary virtual
data storage device (401) from said host processor
(11, 12), for writing said modified data record in
available memory space in one of said redundancy
groups (423);

means (101, 113) for converting said memory
space used to store said originally received data
record to available memory space; and

wherein said maintaining means (113) creates
correspondence data indicative of the storage of said
modified data record in said available memory space.

7. The system of claim 1 further comprising:

means (101, 103) for reserving at least one
of said plurality of disk drives (12*-*) as backup
disk drives (125-1 to 125-r), which backup disk drives
(125-1 to 125-r) are shared in common by said
redundancy groups (421 - 428);

means (101, 121) for identifying one of said
at least two disk drives (12*-*) in one of said
redundancy groups (421) that fails to function; and

means (121) for switchably connecting one
(125-1) of said backup disk drives (125-1 to 125-r) in
place of said identified failed disk drive.

8. The system of claim 7 further including:

means (101) for reconstructing said stream
of data records written on said identified failed disk
drive, using said associated redundancy data; and

means (104, 120, 121) for writing said
reconstructed stream of data records on to said one
backup disk drive (125-1).

9. The system of claim 8 wherein said
reconstructing means (101) includes:

means (101, 113) for generating said stream
of data records written on said identified failed disk
drive using said associated redundancy data and the
data records written on the remaining disk drives in
said redundancy group (421).

10. A method of storing data records on one of
a plurality of virtual data storage devices identified
by at least one associated host processor (11, 12) and
in a disk memory system (100), which data records are
transmitted to said disk memory system (100) by said
host processor (11, 12), said disk memory system (100)
having a plurality of disk drives (12*-*), a subset of
said plurality of disk drives (12*-*) being configured
into at least two redundancy groups (421 - 428), each
of which includes at least two disk drives, comprising
the steps of:

selecting, in response to the receipt of a
stream of data records from said associated host
processor (11, 12), available memory space in one of
said redundancy groups (421) to store said received
stream of data records thereon;

writing each of said received streams of
data records and redundancy data associated with said
received streams of data records in said selected
available memory space;

maintaining data indicative of the
correspondence between each of said received streams
of data records and the identity of the disk drives in
said selected redundancy group (421) on which each of
said received streams of data records is stored;

emulating, in response to said host
processor (11, 12) requesting activation of duplex
copy group capability for designated primary (401) and
secondary (402) virtual data storage devices in said
disk memory system (100), said secondary virtual data
storage device (402) including:

storing data (431) indicative of the
identity of said designated primary virtual
data storage device (401) including said
correspondence data (411) which identifies
said disk drives (421) on which each of
said received streams of data records is
stored,

storing data (432) indicative of the
identity of said designated secondary
virtual data storage device (402),
including data (437) which identifies said
disk drives (421) on which each of said
received streams of data records is stored
in said designated primary virtual data
storage device (401), and

accessing, in response to a query from
said host processor (11, 12) to said
designated secondary virtual data storage
device (402), said designated primary
virtual data storage device (401).

11. The method of claim 10 wherein said
correspondence data comprises a set of pointers (411) 




which identify said selected available memory space in
said selected redundancy group (421), and said
secondary virtual data storage device (402) identity
data comprises data (437) indicative of the identity
of said designated primary virtual data storage device
(401), wherein said step of emulating further
includes:

creating, in response to said host processor
(11, 12) transmitting a command to said disk memory
system (100) to discontinue duplex copy group
operation, data indicative of the
identity of said correspondence data (411) for said
designated primary virtual data storage device (401),

associating said correspondence data
identity data with said designated secondary virtual
data storage device (402),

deleting said stored data (437) indicative
of the identity of said designated primary virtual
data storage device (401) from said secondary virtual
data storage device identity data (414), and

interpreting, in response to a query from
said host processor (11, 12), said correspondence data
as said secondary virtual data storage device (402).

12. The method of claim 11 wherein said host
processor (11, 12) transmits a command to said disk
memory system (100) to reestablish said discontinued
duplex copy group by deleting one of said primary
(401) and secondary (402) virtual data storage
devices, said step of emulating further includes:

deleting said correspondence data (411, 414)
for said designated deleted virtual data storage
device (401, 402).

13. The method of claim 10 wherein said
correspondence data (411) comprises a set of pointers
which identify said selected available memory space in
said selected redundancy group (421) and said stored
secondary virtual data storage device identity data
(414) comprises a copy of said correspondence data
(411) for said primary virtual data storage device
(401), said step of emulating further comprises:

maintaining, in response to said host
processor (11, 12) transmitting a command to said disk
memory system (100) to discontinue duplex copy group
operation, said primary and secondary virtual data
storage device correspondence data (411, 412)
independent of each other.

14. The method of claim 10 wherein said
correspondence data comprises a set of pointers (411)
which identify said selected available memory space in
said selected redundancy group (421), and said
secondary virtual data storage device identity data
(414) comprises data indicative of the identity of
said designated primary virtual data storage device
(401), said step of emulating further includes:

creating, in response to said host processor
(11, 12) transmitting a command to said disk memory
system (100) to suspend duplex copy group operation,
a duplicate copy (414) of said
correspondence data (411) for said designated primary
virtual data storage device (401),

associating said copied correspondence data
(414) with said designated secondary virtual data
storage device (402),

deleting said stored data indicative of the
identity of said designated primary virtual data
storage device (401) from said secondary virtual data
storage device identity data (414), and

interpreting, in response to a query from
said host processor (11, 12), said duplicate copy of
said correspondence data as said secondary virtual
data storage device (402).

15. The method of claim 10 further including the
steps of:

writing, in response to the subsequent
receipt of modifications to one of said data records
stored in said designated primary virtual data storage
device (401) from said host processor (11, 12), said
modified data record in available memory space in one
of said redundancy groups (421 - 428);

converting said memory space used to store
said originally received data record to available
memory space;

wherein said step of maintaining includes
creating correspondence data indicative of the storage
of said modified data record in said available memory
space.

16. The method of claim 10 further comprising
the steps of:

reserving at least one of said plurality of
disk drives as backup disk drives (421 - 428), which
backup disk drives (125-1 to 125-r) are shared in
common by said redundancy groups (122-1 to 122-n+m,
124-1 to 124-n+m);

identifying one of said at least two disk
drives (12*-*) in one of said redundancy groups
(421) that fails to function; and

switchably connecting one (125-1) of said
backup disk drives (125-1 to 125-r) in place of said
identified failed disk drive.

17. The method of claim 16 further including the
steps of:

reconstructing said stream of data records
written on said identified failed disk drive, using
said associated redundancy data; and

writing said reconstructed stream of data
records on to said one backup disk drive (125-1).

18. The method of claim 17 wherein said step of
reconstructing includes:

generating said stream of data records
written on said identified failed disk drive using
said associated redundancy data and the data records
written on the remaining disk drives in said
redundancy group (421).